ANAWIKI: Creating Anaphorically Annotated Resources through Web Cooperation

Authors

  • Massimo Poesio
  • Udo Kruschwitz
  • Jon Chamberlain
Abstract

The ability to make progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand annotation is very expensive and time-consuming; in practice, it is infeasible to think of annotating more than one million words. However, the success of Wikipedia, the ESP game, and other projects shows that another approach might be possible: collaborative resource creation through the voluntary participation of thousands of Web users. ANAWIKI is a recently started project that will develop tools to allow and encourage large numbers of volunteers over the Web to collaborate in the creation of annotated corpora (in the first instance, of a corpus annotated with semantic information about anaphoric relations) through a variety of interfaces.

Similar resources

Addressing the Resource Bottleneck to Create Large-Scale Annotated Texts

Large-scale linguistically annotated resources have become available in recent years. This is partly due to sophisticated automatic and semiautomatic approaches that work well on specific tasks such as part-of-speech tagging. For more complex linguistic phenomena like anaphora resolution there are no tools that result in high-quality annotations without massive user intervention. Annotated corpo...

(Linguistic) Science Through Web Collaboration in the ANAWIKI Project

Despite the impressive progress made in recent years in all areas of natural language processing there are still tasks that do not perform well enough to be used in everyday applications. One example is anaphora resolution. The most promising approach to get significant improvements in this area is to create sufficiently large linguistically annotated resources which can then be used to train, ...

Constructing an Anaphorically Annotated Corpus with Non-Experts: Assessing the Quality of Collaborative Annotations

This paper reports on the ongoing work of Phrase Detectives, an attempt to create a very large anaphorically annotated text corpus. Annotated corpora of the size needed for modern computational linguistics research cannot be created by small groups of hand-annotators; however, the ESP game and similar games with a purpose have demonstrated how it might be possible to do this through Web collabora...

A new life for a dead parrot: Incentive structures in the Phrase Detectives game

In order for there to be significant improvements in certain areas of natural language processing (such as anaphora resolution) large linguistically annotated resources need to be created which can be used to train, for example, machine learning systems. Annotated corpora of the size needed for modern computational linguistics research cannot however be created by small groups of hand-annotator...

Creating Digital Language Resources

We discuss building digital language resources (such as annotated corpora, lexicons, ontologies, terminologies, tools), which are the main prerequisite for successful communication and information management in the e-society of the 21st century. We give an overview of the main requirements and best practices, and point to necessary steps for creation and maintenance of standards-based and reusable...

Publication date: 2008